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Abstract 

A plausible consequence of rugged energy landscapes of biomolecules is that functionally competent 
folded states may not be unique, as is generally assumed. Indeed, molecule-to-molecule variations 
in the dynamics of enzymes and ribozymes under folding conditions have recently been identified 
in single molecule experiments. However, systematic quantification and the structural origin of the 
observed complex behavior remain elusive. Even for a relatively simple case of isomerization dynamics 
in Holliday Junctions (HJs), molecular heterogeneities persist over a long observation time {T bs ~ 
40 sec). Here, using concepts in glass physics and complementary clustering analysis, we provide 
a quantitative method to analyze the smFRET data probing the isomerization in HJ dynamics. We 
show that ergodicity of HJ dynamics is effectively broken; as a result, the conformational space of HJs is 
partitioned into a folding network of kinetically disconnected clusters. While isomerization dynamics in 
each cluster occurs rapidly as if the associated conformational space is fully sampled, distinct patterns 
of time series belonging to different clusters do not interconvert on T bs- Theory suggests that persistent 
heterogeneity of H J dynamics is a consequence of internal multiloops with varying sizes and flexibilities 
frozen by Mg 2+ ions. An annealing experiment using Mg 2+ -pulse that changes the Mg 2+ cocentration 
from high to low to high values lends support to this idea by explicitly showing that interconversions 
can be driven among trajectories with different patterns. 
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I. INTRODUCTION 



Although challenged occasionally [IH3], it has long been considered as a general principle 
that the folded states of biomolecules are uniquely determined by their sequences and envi- 
ronmental conditions [4J. Observation of multiple folding paths and the dependence of the 
folding routes on the initial conditions [SHSj that illustrate the ruggedness of the folding land- 
scapes [U [9j Uni [El EH] have illustrated the complexity of biomolecular foldings. These findings 
by themselves do not challenge the notion of a unique native state. However, recent findings 
from single molecule experiments on several biomolecular systems explicitly showed persistent 
heterogeneities in time traces (or molecule-to-molecule variations) generated under identical 
folding conditions [TJ CE21 [T4HT7] . Unlike phenotypic cell-to-cell variability among genetically 
identical cells that can be visualized using microscope [18], the heterogeneity among individual 
biomolecules on much small length scales is tantalizing in that it is difficult to reconcile with 
the conventional notion that functional states of proteins and RNAs are unique or that various 
native basins of attraction easily interconvert. For example, in docking-undocking transitions of 
surface immobilized hairpin ribozyme pQ and Tetrahymena group I intron ribozyme [16], each 
time trace for individual molecule displays very different dynamic pattern with a long memory 
without apparent compromise in catalytic efficiency; thus it was suggested that these ribozymes 
have multiple native states [16] . Control experiments under vesicle encapsulation still displayed 
heterogeneities as those under surface-immobilization, which indicates that the heterogeneities 
are intrinsic to molecules being probed, and is not an instrumental artifact [T2l [T5l [T9] . Given 
the ubiquity of molecule-to-molecule variations in single molecule experiments, it behooves us to 
devise analytic tools to quantify the observations using a rigorous theoretical treatments based 
on statistical mechanics. Furthermore, a molecular system that is simpler than the structured 
RNAs mentioned above but still displays persistent conformational heterogeneity could allow 
us to glean the molecular origin of the heterogeneity. To this end we study the dynamics of 
Holliday Junctions that undergo globally a simple two-state like isomerization transition in the 
presence of Mg 2+ ions, but reveal a complex behavior when examined in detail. 

Time series data from single molecule measurements can reveal the rate of conformational 
space navigated by a molecule [20, 21]. Thus, the manifestation of molecule-to-molecule variation 
in the measured trajectories, which is often hard to quantify because the time series of an 
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observable in single molecule measurement of biomolecules results from a projection of dynamics 
in high dimensional space onto a lower dimension, implies that each molecule samples only a 
small subset of the entire conformational space on T bs- The extent to which a trajectory samples 
the allowed conformational space depends on the length of the observation time T bs> Conversely, 
states hidden in multiple deep furrows of the folding landscapes at high dimension restricts the 
dynamics of each molecule to one of many states that are non-interconvertible within T bs- 

Historically the widely accepted notion of unique native state in biomolecules was hypothe- 
sized based on bulk measurements, where an averaged property of a probe variable is obtained 
from an ensemble of snapshots. Such a conclusion assumes ergodicity, i.e., the equivalence be- 
tween time and ensemble average of an observable. Although ergodicity is a necessary condition 
for equilibrium systems, it is in practice difficult to realize because of unlikelihood that in a 
single time trace a molecule can sample the entire configurational space [22J. Furthermore, the 
situation is further exacerbated because nominally the observation time in practice, or relevant 
time scales for many biological phenomena are limited. In a rugged landscape, a molecule with 
an initial conformation distinct from others would repeatedly sample distinct region of the fold- 
ing landscape for a long observation time, which could be longer than "biologically relevant 
time scale". This scenario results in heterogeneous dynamics, and ensemble averaging would 
obscure the complexity of the structural features of the underlying landscape. Thus, in this 
sense the ergodicity of the system is "effectively" broken. Now that molecular heterogeneities 
are clearly demonstrated in many single molecule data for a variety of unrelated systems it is 
a major theoretical challenge to devise practical tools to reconstruct rugged folding landscapes 
from time series. 

Here, we perform smFRET experiments [23], and use concepts from glass physics [2j [25] 
and complementary clustering algorithms j3j [4] to make systematic analysis on smFRET data 
of metal-ion driven conformational changes in a Holliday Junction (HJ) with a DNA sequence 
that disallows branch migration [6j. We show quantitatively that although the HJ dynamics 
at the ensemble level can be pictured using a two-state model, the ergodicity of the system 
is effectively broken. The associated folding landscape of HJs is visualized in terms of rarely 
interconverting multiple ordered states embedded in the two isoforms of HJs. Furthermore, the 
simplicity of HJ structure is explored to discover the structural origin of the heterogeneities in 
dynamics. The presence of internal multiloop topologies with varying sizes and flexibilities at 
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the junction and the high local concentration of Mg 2+ ions around it, the latter of which is 
calculated using molecular dynamics simulation, lead us to propose that the heterogeneities in 
H J isomerization are due to non-interconverting multiloop topologies "pinned" by the complexed 
Mg 2+ ions. We validate the structural explanation by showing that trajectories with different 
patterns interconvert by an annealing protocol involving cycling the Mg 2+ ions from high to low 
to high concentrations. 

II. RESULTS AND DISCUSSION 

smFRET measurements of HJ dynamics and analysis using ensemble averaging. In 

DNA recombination, HJs are essential intermediates for strand exchange (Fig. la) [29]. At Mg 2+ 
concentrations exceeding ~ 50 /iM, HJs exist in two distinct isoforms (iso-I and iso-II) both of 
which have the characteristic X-shaped architectures. According to previous studies 0130], in 
the absence of divalent ions HJs have stable open square structure whose apparent FRET effi- 
ciency value is E « 0.3, but at [Mg 2+ ]> 50 /xM make conformational changes between two stable 
conformers iso-I and iso-II, via the open square structure. We performed smFRET experiments 
for a range of Mg 2+ ion concentrations to monitor the Mg 2+ -dependent isomerization dynamics 
of surface-immobilized HJs by attaching Cy3 and Cy5 dyes to the terminus of X and R branches 
(Fig. la. See Supplementary Text 1). The time dependent FRET efficiency Ei(t) for z-th 
molecule (i = 1, 2, . . . N) was calculated by taking the ratio between the emission signals {I Aj.it) 
and lD,i(t)) from acceptor and donor dyes using Ei(t) = 1 A,i(t) / '{1 A,i(t) + lD,i(t)) (Fig. S3). 
The individual time trajectories of Ei(t) monitored for T bs ~ 40 sec at [Mg 2+ ]> 0.5 mM shows 
multiple transitions between high (E ~ 0.5) and low FRET (E ~ 0.2) values, corresponding 
to the iso-I and iso-II conformers, respectively. Histograms of FRET values collected over the 
entire observation time and population, P ens [E\ reveal bimodal distributions for [Mg 2+ ]> 0.5 
mM, all of which can nicely be fit to double-Gaussian functions, and unimodal distribution for 
[Mg 2+ ] = 0.0 mM (Fig. lb and Supplementary Fig. S4). With increasing [Mg 2+ ], the two 
peaks on P ens (E) separate more clearly, indicating that the transitions is becoming increasingly 
cooperative. In addition, the ensemble average of FRET efficiency (E) = J 1 EP ens (E)dE is 
invariant at [Mg 2+ ]> 0.5 mM (Supplementary Fig. S4), which is in accord with the previous 
finding [6] that showed that the relative population of the two isoforms is entirely controlled by 
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the junction sequences and not by the ionic conditions. 

In addition to the ensemble- averaged thermodynamic measure P ens {E) ) transition kinetics, 
which occurs on an average time of tj^u ~ (0.1 — 1) sec, between the two isomers provide 
a glimpse into how HJ isomerizations occur over varying Mg 2+ conditions. As suggested 
by the shapes of the Mg 2+ ion-dependent P ens (E) (Supplementary Fig. S4), the transitions 
between the two conformers slow down with increasing Mg 2+ ion concentration, presumably 
due to an increase in the free energy barrier at higher concentration. The survival probabilities 
(£(£)) calculated from the dwell time distributions (pdweii(t)) for both high and low FRET 
values E(t) = 1 — J* Pdweii(^)dr can be approximately fit using a single exponential function 
although 5 % of population exhibit a deviation from the fit. Thus, it is tempting to surmise 
that an approximate two-state picture is adequate to describe the transition between the two 
means [6] (Fig. lb and Supplementary Fig. S5). 

Molecule-to-molecule variation in individual time traces. The two-state model for the 
Mg 2+ ion-dependent isomerization dynamics of the HJ, gleaned from averaging over an ensemble 
of molecule, ignores the intrinsic heterogeneities among the trajectories. However, an inspection 
of a few disparate trajectories makes clear the dramatic variations between individual molecules. 
In Fig. 2a, one can immediately identify the differences between the three exemplary trajectories 
that maintain the characteristic pattern of dynamics on T bs- The multiple transitions in each 
time trace render the time average of i.e., Si(t) = \ f£ drE^r) stationary, which suggests 

that the conformational space associated with that particular trajectory is exhaustively sampled. 
Thus, using such a trajectory, it is legitimate to calculate a stationary distribution p 8 {E\i) = 
lim t ^7; 6s p(E, t\ i) [31] . which represents the population of microstates probed on T bs i n terms 
of FRET efficiency value. Notably, despite a number of rapid transitions in each time trace, a 
finding that is nominally associated with canonical ergodic sampling of conformational space, 
there are qualitative differences between p s (E\i) and P ens (E), and among p s (E;i)s with i = 
1, 2, 3, indicating that the ergodicity of HJ dynamics is effectively broken on T bs ~ 40 sec. 

In the literature the overall variation in dynamics and the associated heterogeneities are often 
conveniently visualized in the form of scatter plot of the mean dwell times at low and high FRET 
signals for each molecule as in Fig. 2b [T5l 1321 [33] . The mean dwell times for the three time 
traces in Fig. 2a are encircled in Fig. 2b. Note that the scattered data points in Fig. 2b should 
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in principle distribute in a single spot if the HJ dynamics were truly ergodic. 

At an abstract level, a landscape picture implicated from the above findings is the following: 
For a given time trace belonging to a dynamical pattern, corresponding to a specific molecule a, 
Tobs ~ 40 sec is long enough to observe multiple isomerization events, so that the time scale for 
single isomerization between iso-I and iso-II (r^ u ) is much smaller than Tobs ( T iUn ^ %bs)- 
Thus, HJ explores the conformations in the a state exhaustively; however it is not long enough 
for interconversion to take place between the pattern a and another pattern, say /3, i.e., 
Tobs <C T conv where r^f is the interconversion time between a and f3 states, implying that 
a substantially high kinetic barrier separates the states a and /3. Therefore, dynamics of HJs 
are effectively ergodic within each state on 7^ 5 , but Tobs is not long enough to ensure ergodic 
sampling of the entire conformational space — a situation that is reminiscent of ergodicity 
breaking in supercooled liquids [2j (see also Supplementary Text 2). 

Assessing ergodicity from time series. In order to demonstrate quantitatively that in- 
terconversion between the multiple states of HJ implicit in smFRET trajectories is unlikely 
we use concepts in glass physics [34]. If the entire conformational space of the HJ is ef- 
fectively sampled during Tobs, which would establish ergodicity, then the time averaged 
i.e., £i(t) should converge to an ensemble average for all i. We use a time-dependent metric 



^E(t) = X^Li y e i(t) ~ £ (t)j w ^h e(i) = YliLi e i{^)-> introduced to probe the approach to 
equilibrium in the context of simulations of supercooled liquids and glasses [21 134] , to analyze 
smFRET time series data. For "ergodic" systems, f2g(t) converges to zero at t — >• oo. For 
finite time t with fi#(£) 7^ 0, it can be shown that f2g(t) decays as t~ l asympotically; thus 
[^(^/^(O)] -1 ~ D E t (see Supplementary Text 3 for further details). Because the form of the 
metric f2#(t) is similar to the mean square displacement, the slope of the [^£;(£)/^£;(0)] -1 = 
De, is interpreted either as an effective diffusion constant or an effective sampling rate of the 
conformational space projected onto the FRET efficiency coordinate. 

For the entire ensemble of HJ trajectories, however, we find that [^(t)] -1 is not linear in 
time, converging to a finite value. The nonlinearity of [^(t)] -1 indicates that the conformations 
belonging to distinct states do not mix on 7^ 5 , which implies effective ergodicity breaking 
in the HJ dynamics on Tobs (Fig. 3b) [2]. Remarkably, this conclusion still holds for HJ 
trajectories probed on an extended time scale T Q bs ~ 70 sec (see Supplementary Fig. S6). 
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Notably, the variance at T bs-> quantified using Q E (7^6 a )/fi^(0), is greater at higher [Mg 2+ ] (Fig. 
3b), suggesting that interactions with Mg 2+ ions are responsible for creating the heterogeneous 
environment for HJ molecules. 

Partitioning the conformational space using complementary clustering algorithm. 

How many states that do not interconvert on T bs are needed to fully account for the ex- 
perimental data? In order to answer this question we complement the general approach 
discussed above to quantitatively assess ergodicity from time trajectories with K-means 
clustering algorithm. This allows us to partition the conformational space of HJ into multiple 
"ergodic subspaces". The K-means clustering algorithm partitions data into K clusters, so 
that the distance of data belonging to the cluster to the cluster mean is minimized and the 
inter-cluster distance are maximized. In our problem, the data correspond to the stationary 
distributions of individual time trajectories. We partitioned the set of stationary distributions 
{p s (E]i)\i = I,-- - , N} into K clusters, ensuring that [^(t)] -1 ~ t in each cluster (see 
Supplementary Texts 4, 5 and Fig. S7 for further detail) so that within each cluster the 
dynamics is ergodic. The consequences of our analysis are summarized as follows: (i) For 
[Mg 2+ ]=50 mM, {p s (E]i)\i = 1, • • • , N} are partitioned into five clusters (ergodic subspaces) 
(Fig. 4a. See also Supplementary Text 5 and Fig. S7). (ii) The information of data list in 
each cluster enables us to partition the set of time averaged trajectories ({£*(£)}) (Fig. 4b) 
and scatter plot of dwell times (Fig. 4c), shown in Fig. 3a and Fig. 2b, respectively, (hi) The 
effective diffusion constant D E in i?-space associated with the conformational sampling of HJs 
varies widely from one ergodic subspace to another (Fig. 4d). If we assume that the subspace 
k = 2, with the largest D E (Fig. 4d), has a smooth landscape (e&=2 = 0), the roughness scale 
for the subspace k = 4 is e k=4 = ^log (0.5/0.07) « 1.4 k B T from D E = D° E e~^ e2 [351 E29. 
Comparison between Figs. 4c and 4d shows that the large heterogeneities of dwell times in the 
scatter plot are mainly due to the molecule belonging to the ergodic subspace with small D E 
(e.g. k = 1,4,5). The physical criterion, namely the dynamics in each subspace be ergodic, 
imposed in our clustering method makes our analysis unique, providing further glimpses into the 
details of folding landscapes, which are masked in ensemble average quantities P ens (E) and £(t). 

Structural origin of molecular heterogeneity. What is the structural origin of multiple 
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states in HJ, which leads to dynamics heterogeneity? Simplicity of the HJ structure allows 
us to infer the structural origin of the multiple states and their roles in complex dynamics. 
First, our calculations of the electrostatic potential and ion distribution using 100 ns molecular 
dynamics simulations show that Mg 2+ ions are localized near the junction region and grooves 
that have high negative charge density (Fig. 5a). Second, the results from m-fold algorithm 
[37] indicate that the open square form of HJ, representing the secondary structure of HJ that 
forms transition state (TS) at the top of the path connecting the two isoforms at high [Mg 2+ ] 
can have a spectrum of distinct internal multiloop topologies at the junction (Fig. 5b). In the 
absence of branch migration, which is ruled out in our experiments, it is conceivable that the 
topology of the internal multiloop with varying sizes and flexibilities determines both the rate of 
HJ isomerization and inter-dye distance. Variations in isomerization rate and inter-dye distance 
are reflected in {p s (E\i)} that are partitioned into five clusters. Taking these results together, 
we argue that the secondary structure of HJ especially at the internal multiloop, which mediate 
the conformational transition between the two isoforms, is pinned by Mg 2+ ions. Consequently, 
the structural rearrangement needed for interconversion between two distinct states within a 
given isoform is prevented. Note that the TS ensemble is quantized and that the actual free 
energy gaps in the spectrum of transition states in the presence of Mg 2+ ions could be larger 
than the calculated values because the m-fold algorithm does not include the effect of the specific 
binding of multivalent ions (Control experiments show that monovalent (Na + ) ions with high 
concentrations do not lead to bimodal isomerization. See Supplementary Fig.S9). In principle, 
structural rearrangement could occur via transitions from high to low free energy QTSs (Fig. 5b) 
between the quantized TSs (QTSs), thus allowing for interconversion between the five states. 
However, the QTSs are only transiently populated during the isomerization process because 
transition path times are considerably shorter than the time to cross free energy barriers [38] . 
In other words, the life times of the QTSs are very short, so that the rearrangement of local 
bubbles at the transition state at high Mg 2+ concentrations is unlikely. The transient nature 
of the QTSs implies that during multiple rounds of isomerization the local bubble structures 
are quenched. In addition to Fig. 3b that shows greater molecular heterogeneity at high Mg 2+ 
condition, our idea of non-interconverting QTS in the presence of Mg 2+ ions finds support in the 
previous studies [29J, [39] , which showed that the obligatory open square form intermediates slow 
down DNA branch migration by ^1,000 fold in the presence of Mg 2+ ions because this process 
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requires the rupture and formation of base pairs. 

The calculations, summarized in Fig. 5, explain our experimental findings. In the folding 
landscape of HJ emerging from our analyses (Fig. 5c), transitions are only allowed between 
iso-I and iso-II via "a band of QTSs" within which the free energy gap is small enough to allow 
interconversion on T bs- Lack of transitions between two different states (say a and f3) within a 
given isoform (T bs ^ T cottv) * s explained by noting that rupture of Mg 2+ -stabilized base pairs 
are required for rearrangements from one multiloop topology to another. The conformational 
space connecting iso-I and iso-II is partitioned into a number of kinetically disjoint states 
(£ = a,/3,7...), reflecting the band structure of the QTS ensemble. In this sense the persistent 
pattern of a smFRET trajectory is an imprint of specific disjoint states in the rugged folding 
landscape. 

Mg 2+ pulse annealing experiments. An immediate prediction of our model (Fig. 5c) is 
that interconversion between states a and f3 should be facilitated by an annealing protocol, 
enabling the release of Mg 2+ ions from frozen internal multiloop structures. In order to 
validate this prediction we performed single molecule experiments using a Mg 2+ pulse sequence 
[Mg 2+ ] = 50 mM — » mM — >• 50 mM to induce transition between multiple states (Fig. S2). 
The annealing experiments confirmed that washing the Mg 2+ ions from HJ molecules indeed 
facilitates interconversion between trajectories with distinct patterns (compare the trajectories 
or two p s (E;i)s shown on the side of each panel in Fig. 6a calculated from the blue and red 
intervals of the trajectories, corresponding to the moment before and after the Mg 2+ pulse). 
We also calculated the Euclidean distance of p s {E]i) to the centroid of the five clusters in 
Fig. 4a before and after the Mg 2+ pulse annealing experiments. Here, the centroid is the 
arithmetic mean of {p s (E;i)\i 6 k] with k = 1,2, •• -5 (Fig. 4a). The distances of p s (E;i) 
to the cluster means change after the Mg 2+ pulse, which is also reflected in the reshuffling 
of the population of the HJ molecules among the five clusters. Consequently, transitions 
that are prohibited on the time scale of 40 sec are induced as seen by a redistribution of the 
population among the five distinct clusters (Fig. 6b). Resetting the initial memory of each 
HJ molecule is achieved by temporarily removing the Mg 2+ from the solutions, enabling the 
conformational interconversions in the otherwise non-interconverting HJ time traces. The 
observation of facilitated interconversion using Mg 2+ pulse corroborates the idea that Mg 2+ ions 
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and their interaction with HJ structure are the major cause of the persistent conformational 
heterogenetiy in HJ, ruling out the possible contributions to the molecular heterogeneity from 
experimental artifacts such as aforementioned surface immobilization [15], heterogeneous dye 
stacking with DNA bases [40j, or chemical modifications [33j (Supplementary Fig. S10 shows 
the non-responsiveness of donor-only tagged HJs to the Mg 2+ pulse, corroborating that the 
Mg 2+ ions affect the internal conformational dynamics, not the dye stacking or dye-to-surface 
interaction). To recapitulate, in HJs, Mg 2+ ions play a key role not only in slowing down the 
isomerization by stabilizing the native states in two isoforms (Supplementary Figs. S4, S5) 
but also in reducing the probability of interconversion between two different HJ trajectories by 
enlarging the free energy gap of the QTS ensemble (Fig. 5b). 



III. CONCLUSIONS 

Single molecule measurements provide new avenues to probe the dynamics of biological 
systems by providing information not available in ensemble experiments. However, most of the 
current experimental studies build distribution of observable or make dwell time analysis by 
ensemble averaging, thus overlooking molecular heterogeneity. Our novel theoretical analysis 
employing the concept of ergodicity breaking provides a practical framework to analyze single 
molecule data and to decipher complex folding landscapes of biological systems. Only by 
quantitatively analyzing each trajectory individually, without succumbing to the temptation 
to average, the dynamical complexity of biological molecules can be fully revealed. Indeed, as 
noted recently, quantifying and understanding the consequence of non-ergodic behavior of RNA 
molecules is a major challenge [41], and the present work provides the framework for meeting it. 
Finally, it is worth noting that although we have used HJ as example to explore quantitatively 
the concept of heterogeneity our conclusions are far reaching. We expect similar behavior in 
biological systems spanning spatial scales from nm to several microns (cell dimensions) and 
time scales from /xs to minutes and longer. 
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FIG. 1: HJ dynamics probed with smFRET experiments and their analysis using conventional ensemble 
averaging method, a. A schematic of strand exchange in DNA recombination (top) and the two isoforms 
connected by the open square structure (bottom). The Cy5 (magenta) and Cy3 (green) dyes attached 
to the B and H branches for the smFRET measurement are depicted as spheres, b. Shown are the 
part of FRET time traces ({Ei(i)} with i = 1,2, . . . , N with N = 315) obtained for individual HJ 
molecules at [Mg 2+ ] = 50 mM. Similar to the findings in the literatures, the conventional analysis 
using the ensemble of these time traces without deliberating the molecule-to-molecule variation gives 
rise to an interpretation of the apparent two-state behavior for HJs. The ensemble averaged histogram 
of the FRET efficiency E, i.e., P ens (E), is nicely fit to a double-Gaussian curve (blue line), and the 
dwell time distribution (bottom panel) for low (data in green) and high (data in blue) FRET states 
are approximately fit to single exponential functions (red lines). 
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FIG. 2: Molecule-to-molecule variation (or molecular heterogeneity) manifested in the time traces of 
isomerization dynamics of HJs a. Three disparate FRET time traces Ei{t) at [Mg 2+ ]=50 mM (blue), 
their time average Si{t) (red), and the corresponding histograms on the right. Because each time trace 
can be considered stationary as clearly indicated in the time-independence of £^(t), it is legitimate to 
build a histogram for each time trace and designate the histogram a stationary distribution, p s (E] , i). 
b. Molecular heterogeneity revealed in the scatter plot of average dwell times at low and high FRET 
state for individual time traces ((tz^), (t/j^)). The encircled data points correspond to the three time 
traces in a. 
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FIG. 3: Probing the ergodicity breaking, a. Probes of ergodic behavior in smFRET trajectories using 
the metric Q E (t) at [Mg 2+ ]=50 mM. ^(t), e(t), and Q E (t)/Q E (0): (defined in the text) are shown in 
black, green, and red lines, respectively, b. The nonlinearity of [^(t)] -1 for [Mg 2+ ]> 1 mM shows 
that the HJ dynamics is non-ergodic for all concentrations on the observation time T i s . 
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FIG. 4: Partitioning the molecules into distinct clusters, a. K-means clustering algorithm combined 
with ergodic criteria partitions the set of stationary distributions {p s (E; i)} into 5 clusters for [Mg 2+ ] = 
50 mM, and determines the list of time traces that belongs to the clusters from k = 1 to 5. b. The 
list of time traces for each cluster determined in a is used to partition {£i(t)} into {si(t)\i E k} for 
k = 1,2, -"5. c. De value are calculated from the fits using ^(0)/^^) ~ De^ for each cluster 
(k = 1, . . . , 5). d. Clustering of dwell time data as a result of the {p s (E; i)} clustering. 
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FIG. 5: Structural model to account for the origin of molecule-to- molecule variation in the HJ dynamics, 
a. (left) Electrostatic potential calculated at 200 mM monovalent ion condition using an X-ray structure 
of HJ (PDB code: 1DCW) (see Supplementary Text 6). The energy scale for the potential is in fc^T/e 
unit, (middle, right) 100 usee MD simulation at T = 310 K (Supplementary Text 7) shows that 
Mg 2+ ions are localized more at the junction and grooves than at the terminus, b. HJs with various 
topologies of internal multiloops, which are the putative QTSs (quantized transition states) connecting 
one state in iso-I and another in iso-II. c. Model for the dynamics of HJ constructed based on smFRET 
experiments and simulations. On the left are the free energy contours for various states. Two isoforms 
in each state are connected by a distinct open square form, whose structures are shown in B. Ensemble 
averaged distribution of the FRET efficiencies, P ens (E), is shown at the bottom. On the right schematic 
of the free energy profiles are shown with the cartoons of HJ structures; the symbols (star, pentagon, 
. . .) at the junction emphasize that the junction structure is intact during the isomerization process. 
Hence, r^ n (f = a, f3 . . .) <C %bs < ^com? (£ V = a, /3, 7, . . . with £ ^ rf) is established. 
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FIG. 6: Mg 2+ pulse experiments to reset the molecular population in conformational space, a. Effect of 
Mg 2+ pulse experiment. Shown are three representative trajectories that change their pattern (E(t) or 
p s (E; i)) in response to Mg 2+ pulse sequence. The dashed lines depict the peak positions of PsiE; i) and 
underlies the differences mp s (E; i) before (red) and after (blue) washing off Mg 2+ ions. b. Depicted are 
Mg 2+ pulse-induced transition frequency matrix and diagram among kinetically disjoint states based 
on 148 FRET trajectories. The indices at the sides of matrix and in the nodes denote the cluster 
number k = 1,2... 5. The numbers in the parentheses are the occupation number in each cluster, 
which can be obtained by summing up the transition frequency from one cluster to the other. The 
diagram on the right is the kinetic network describing the HJ transition under Mg 2+ pulse. The widths 
of the arrows are in proportion to the number of transitions. 
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1. SINGLE MOLECULE FRET EXPERIMENTS 



To assemble the Holliday junction, the following DNA sequences were purchased from 
IDTDNA (Coralville, IA). 

R branch: 5'-biotin-TTTTTTTT CCCACCGCTCGGCTCAA^TGGGry 

H branch: 5'-Cy3- CCGTAGCAGCG CG AG CGGTGGG-3' 

X branch: 5'-GGGCGGCGACCT CCCAGTTGAGCGCTTGCTAGGG-3' 

B branch: 5'-Cy5-CCCTAGCAAGC CGCTGCTACGG -3' 
where the sequences with identical font style represent the pair of complementary DNA strands. 
The DNA strands were annealed by heating the mixture of DNAs (1 /iM, 20 jil) to 90 °C in TN 
buffer (10 mM Tris with 50 mM NaCl, pH 8.0), and slowly cooling down to 4 °C with 1 °C/min 
cooling rate. For single-molecule FRET (smFRET) measurements, a sample chamber was made 
between a cleaned quartz microscope slide (Finkenbeiner) and a cover slip using double-sided 
adhesive tape. DNAs were immobilized on quartz surface by successive additions of biotinylated 
BSA (40 /i/, 1 mg/ml, Sigma- Aldrich) , streptavidin (40 /x/, 0.2 mg/ml, Invitrogen), and DNA in 
TN buffer. The concentration of DNA was adjusted to achieve proper single molecule density. 
Each injection was incubated for 2 minutes, which was followed by washing with TN buffer. 
Experiments were performed at room temperature in a conventional imaging buffer (10 mM Tris- 
HC1 with 0.4% (w/v) glucose (Sigma-aldrich), 1% (v/v) trolox (Sigma- aldrich), lmg/ml glucose 
oxidase (Sigma-aldrich), 0.04 mg/ml catalase (Roche, Nutley, NJ), and designated magnesium 
ion) by using a home-built prism-type total internal reflection single-molecule FRET setup (Fig. 
SI). Specifically, a green laser (532 nm, Compase 215 M, Coherent) was used as an excitation 
source. Fluorescence signals of donor and acceptor were collected through a water-immersion 
objective (Olympus, UPlanoSApo 60x/1.2w), divided using dichroic mirror (Chroma, 635 dcxr) 
as a wavelength, and finally focused on different areas of an electron multiplier charge-coupled- 
device camera (Andor, Ixon DV897ECS-BV). To reliably detect individual transitions of HJ, 
optimum exposure time of the camera was selected at each Mg 2+ concentration (30 ms for 5 mM 
and 10 mM Mg 2+ concentrations, and 50 ms for other concentrations). Data acquisition and 
selection of single molecule traces were done by using home-made programs written in VC++6.0, 
and IDL (ITT), respectively. 

For Mg 2+ pulse experiment (Fig. S2), a teflon tube was connected to the exit hole of a 
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sample chamber filled with 50 mM Mg 2+ buffer, and a pipette tip filled with Mg 2+ -free imaging 
buffer was carefully plugged into the entrance hole. After starting a data acquisition, the buffer 
solution in the detection chamber was rapidly replaced with a Mg 2+ -free buffer solution in the 
pipette tip by pulling a syringe tube connected to the Teflon tube. Then, remaining solution 
in the pipette tip was carefully replaced with a 50 mM Mg 2+ buffer, and the new solution 
containing 50 mM Mg 2+ was rapidly introduced into the detection chamber by suction. During 
the whole process, data acquisition was maintained. The measured smFRET trajectories, under 
a range of counterion (Mg 2+ ) concentrations, were analyzed using concepts in glass physics and 
bioinformatic tools. 

2. MEMORY OF INITIAL CONDITION 

Another evidence for the inadequacy of using E(t) and P en8 (t) calculated over ensemble in 
unraveling the rugged folding landscapes can be given by comparing the patterns of FRET 
trajectories and the probability of observing such trajectories based on two-state picture. 
A series of dwell times of a trajectory generated from a system that precisely exhibits 
two-state kinetics should obey a renewal (more precisely a Poisson) process with dwell time 
distribution Pdweii(t) = r _1 e _t / r , where r is the mean dwell time. For such a system, the 
probability of observing a time trace with successively long multiple (n) dwells (r* 3> r), i.e., 
[f™ dtpd W eii(t)] n = exp (— nr*/r) is essentially zero (For n — 10 and r* = 5r, the probability of 
observing such a time trace is less than 10 -22 ). Nevertheless, our smFRET trajectories contain 
a preponderance of such cases, in which successively long multiple dwells are observed (see Fig. 
2a). Thus, even during frequent isomerization, a molecule behaves as if it retains "memory" 
of its basin of attraction [I], which implies that a given molecule repeatedly visits exactly the 
same states in iso-I and iso-IL 
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3. PROBING THE ERGODICITY BREAKING 



For an ensemble of smFRET trajectories that reports the FRET efficiency (E) as a function 
of time (£), the fluctuation metric (^(i)) is defined [2]: 

1 n 2 

^(t) = -Y,( £ ^-^) (!) 

1=1 

where = | dsEi(s) is the time average of for molecule z, and e(t) = J^Li ^(*) * s 
ensemble average of £i(t). In accord with the ergodic hypothesis we expect that, Ei(t —> oo) = 
lim^oo 1 / * dsEi(s) = ^ ^(t) = (E), which leads to lim^^ e(t) = ± J^ili e <(* 00 ) = 
(£") . Therefore, the necessary condition, lim^^ ^(t) — >> 0, should be fulfilled for ergodic 
systems. Eq.(l) for fig(t) can be rewritten as 

M*) = ^2 / dsi / rfs 2^^(^(si)-( J E))(^(s 2 )-(E)) = - j( d Sl jf ds 2 C(s ljS2 ) (2) 

where C(si,S2) is the equilibrium time correlation function. If (Ei(t) — (E)) is self- averaging, 
one can put C(si,S2) = C(\s 2 — Since in equilibrium there is no preferred origin of time 
C(si, 5 2 ) depends only on the difference between the two times si and s 2 . With straightforward 
algebra f^dsi J* ds 2 C(\s 2 — si\) = 2 f^dsi f* i ds 2 C(s 2 — Si) = 2 dr{t—T)C{r) one gets f2#(t) = 
t Jq dr (l — |) C(t) | Jq drC(r) for asymptotic limit of i. Therefore at large t the proposed 



2 
t 

metric behaves as 



^(0) ~ D E t K ' 



where D £ = lim^^ 



2f*ds(C(s)/C(0) 



-1 



Note that for ergodic systems, the equilibrium time 



correlation function C(s)/C(0) is a fast decaying function; thus the integral 2 J °° yields 
a constant. Next, the slope of the inverse of ^(^/^(O) yields effective diffusion coefficient, 
D E) the rate at which the molecule navigates the accessible conformational space. However, if 
the erogodicity of the system is effectively broken, [^(^/^(O)] -1 is no longer linear in time, 
but saturates into a constant value, which we observed for the ensemble of time trajectories 
that probe the isomerization dynamics of HJs. 
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4. CLUSTERING THE STATIONARY FRET DISTRIBUTION FUNCTIONS 



The set of stationary distributions calculated for the individual time traces of 
HJ isomerization, {p s (E\i)\i = 1, 2, • • • , iV}, can be partitioned into the K-clusters, 
S(E)=(Si(E), £2 (£"),. . . ,Sk(E)) with K < by employing K-means clustering algorithm 
that is widely used as a bioinformatics tool in analyzing the patterns of gene expression j3j 0] . 
We treated each stationary distribution (p s (E]i)) as a vector in the i?-space and iteratively 
minimized the sum of squared Euclidean distance from each p s (E;i) to its cluster centroid, i.e., 
Ylk=i Y, Ps (E;i)es k (E) \ \Ps(E]i) - Hk(E)\\ 2 where fi k (E) is the mean of S k (E). After the random 
assignment of the cluster centroid to a vector in E'-space, the K-means procedure iteratively 
refined partitioning of the p s {E\ i) among the K clusters by assigning each vector to the nearest 
cluster centroids, then recalculating cluster centroids if cluster membership (S(E)) was changed 
from the previous iteration. If no change in any cluster membership occurred then the K-means 
algorithm was terminated. To avoid local minima, the optimization was initiated from 200 
random cluster centroids. We selected the best set of clusters that reaches the minimum value 
among the 200 trials. 

The quality of clustering result can be assessed by using a silhouette function: 

max{a(z), 0(1) \ 

where a(i) is the average distance (dissimilarity) of a datum i with all other data within the 
same cluster, and b(i) is the lowest average dissimilarity of the datum i with the data belonging 
to other clusters. The optimal K value is determined by examining the mean silhouette function 
( s ) = jf ESLi against K. 



5. CLUSTERING AT THE OPTIMAL VALUES OF K 

At [Mg 2+ ]=50 mM, (s) becomes optimal at K=3, 5, and 13. However, the criteria using 
Euclidean distance or Pearson's correlation do not guarantee the ergodicity of each cluster. 
To this end we evaluated the fluctuation metric ^(O)/^^) for each cluster to ensure its 
ergodicity (Fig. S7). Although maximal (s) was found at K=3 (Fig. S7A), Q E (0)/Q E (t) of the 
cluster k=2 (red), which represents the 36.5 % of the time traces, was not linear in t (Fig. S7B). 
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Consequently, we had to inspect the next optimal values of K (K=5 and 13). 

6. CALCULATION OF ELECTROSTATIC POTENTIAL MAPS OF HJ STRUC- 
TURES 

Electrostatic potential maps of the three Holliday Junctions were calculated by solving the 
nonlinear Poisson-Boltzmann equation using the APBS software [5] at 200 mM monovalent 
ion condition with the dielectric constants for nucleic acids and solvent being 2.0 and 78.5, 
respectively. The calculated electrostatic potentials were visualized (Fig. 5a in the Main text) 
using the PyMol program ( |http:/ /w ww.pymol.org). 

7. CALCULATION OF RADIAL DISTRIBUTION FUNCTION OF MG 2+ IONS 
AROUND HJ 

To calculate radial distribution function of Mg 2+ around HJ, we performed all- atom 
molecular dynamics simulations using the crystal structure of HJ with PDB ID 1DCW. After 
solvating the HJ molecue in a 66A x 66A x66A box containing 8369 TIP3P water molecules, 
37 Na + , 41 Cl~, and 20 Mg 2+ ions were randomly placed. The system was minimized by using 
1 fs time step for 2000 steps with constraints on DNA coordinates, then for an additional 3000 
steps without constraints. During equilibration stage we gradually heated the system from 
K to 310 K using 2 fs time step for 620 ps with constraint, and equilibrated for an additional 
5 ns by using the NPT ensemble at 310 K and 1 atm. After equilibration, we generated a 
100 ns trajectory in the NPT ensemble at 310 K and 1 atm. The integration step during pro- 
duction run is 2 fs. The simulations were performed by using NAMD with CHARMM force field. 
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FIG. SI: A schematic diagram of experiments. DNA molecules were immobilized on a quartz surface 
via streptavidin-biotin interaction. Single-molecule images were taken in a prism-type TIRF (Total 
Internal Reflection Fluorescence) microscope. The fluorescence signals of Cy3 and Cy5 were collected 
by a water-immersion objective, and imaged on a separate areas of an EM-CCD camera (M : mirror, 
L : lens, F : filter, D : dichroic mirror). 
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FIG. S2: A schematic diagram of the magnesium pulse experiments. A sample chamber filled with 50 
mM Mg 2+ was prepared, and a pipette tip with mM Mg 2+ buffer inside was carefully plugged into 
a injection hole. While single-molecule images were taken, mM Mg 2+ buffer was rapidly introduced 
into the detection chamber by pulling a syringe pump connected to the exit hole of the detection 
chamber via a Teflon tube. Remaining solution in the pipette tip was carefully replaced with a 50 mM 
Mg 2+ buffer, and the new solution containing 50 mM Mg 2+ was rapidly introduced into the detection 
chamber by suction. During the whole process, data acquisition was maintained. 
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FIG. S3: (Top) Donor (Id), acceptor (I A ) signals and (bottom) the corresponding FRET efficiency 
(E(t) = I A {t)/(I A (t) + Io(t))) as a function of time. 
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FIG. S4: Histogram of FRET value distribution with varying Mg 2+ concentrations. Each P ens (E) is 
fit to double- Gaussian distribution (blue line), P ens (E) := (/)Gl(E) + (1 ~~ ^)Gh(E) where G\{E) — 
(27va < ^ x )~ 1 / 2 e ^ E Ex ^ / 2f7jE A (A = L,H) with the set of parameters shown in the box. Note that 
population of iso-I and iso-II are equally likely. Biomodal to uni-modal transition at a certain Mg 2+ 
ion concentration (< 50/iM, see Ref.[6j) is reminiscent of second order phase transition. In this case, 
Mg 2+ ions is analogous to inverse temperature in an Ising spin system (see Mg 2+ ion dependent free 
energy and phase diagram on the right). 
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FIG. S5: The probability of remaining in high (green) and low E values (blue). Red lines correspond 
to single exponentials. The mean life time in each state can be obtained using (r) = J °° drrpd we ii(T) = 
Jo°° drS ( T ) where S W = 1 - Jo d^Pdweii(r). 
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FIG. S6: Analysis of the transition dynamics of HJ trajectories at [Mg 2+ ]=50 mM shows that the 
ergodicity is still broken over an extended observation time T bs ~ 70 sec. e(t) and ^(^/^(O) 

(or [^(^/^(O)] -1 ) are shown in grey, green, and red lines, respectively. 
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FIG. S7: Clustering analysis of FRET trajectories at [Mg 2+ ]=50 mM. A. Mean silhouette function 
((«§)) as a function of the number of clusters, K. Optimal (s) values are found at K=3, 5, 13 as indicated 
by the arrows. B. Partitioning of {p s (E;i)} with K=3 yields three clusters (k=l (12.7 %), k=2 (36.5 
%), k=3 (50.8 %)) but the dynamics within the cluster, k=2, is still non-ergodic. f}£(0)/f2#(£) and 
scatter plot of ((tl^^thj)), partitioned into three clusters, are shown on the right. C. Partitioning 
of {p s (E;i)} with K=13 are shown. The cluster k=6 is still non-ergodic but its percentage (5.1 %) 
is small. ^#(0) De calculated for each k(= 1,2..., 13), and scatter plot of ((tl^^th^)), 
partitioned into 13 clusters, are shown on the right. 
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FIG. S8: Mg + pulse-induced transition frequency matrix among 13 kinetically disjoint states based 
on 148 FRET trajectories. The indexes at the sides of matrix denote the cluster number k = 1, 2 . . . 13. 
The number in the matrix element represents the number of transitions from state £ to state 77 induced 
by the sequence of Mg 2+ pulses (see Fig.S2 and Fig. 6a). 



35 




FIG. S9: All the histograms of FRET efficiency, P(E), under 150, 500, 1000 mM monovalent salt 
condition using NaCl but in the absence of Mg 2+ ions are uni-modal, suggesting that monovalent ions 
do not induce the two-state like isomerization dynamics in HJs. This control experiment suggests that 
isomerization dynamics and heterogeneity of this dynamics are related to specific DNA-Mg binding 
events, not the screening effects due to the high ionic strength in 50 mM MgCl2 condition. 
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FIG. S10: Control experiment — Mg 2+ -pulse experiments using donor-only-tagged HJs. The Mg 2+ 
is switched at 40 s (50 mM —> mM) and 65 sec (0 mM 50 mM). It is of note that although 
the photon intensity itself varies from molecule-to-molecule and there are weak gradual drift in the 
data, the overall intensities from the donor dyes are constantly maintained irrespective of the Mg 2+ 
concentration. It is hard to see the change in the pattern of data as in Fig. 6a when only the donor dyes 
are tagged to the molecule. This observation reinforces our hypothesis that the HJ dynamics observed 
with sm-FRET are originated from the conformational dynamics of HJs, not due to the dye- nucleotide 
stacking or heterogeneous interaction with surface. 
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